ABSTRACT



Unsupervised pattern recognition algorithms support the existence of three
gamma-ray burst classes: Class I (long, large fluence bursts of intermediate spectral
hardness), Class II (short, small fluence, hard bursts), and Class III (soft
bursts of intermediate durations and fluences). The algorithms surprisingly assign
larger membership to Class III than to either of the other two classes. A
known systematic bias has been previously used to explain the existence of Class
III in terms of Class I; this bias allows the fluences and durations of some bursts
to be underestimated (Hakkila et al., ApJ 538, 165, 2000). We show that this
bias primarily affects only the longest bursts and cannot explain the bulk of the
Class III properties. We resolve the question of Class III existence by demonstrating
how samples obtained using standard trigger mechanisms fail to preserve
the duration characteristics of small peak flux bursts. Sample incompleteness is
thus primarily responsible for the existence of Class III. In order to avoid this
incompleteness, we show how a new dual timescale peak flux can be defined in
terms of peak flux and fluence. The dual timescale peak flux preserves the duration
distribution of faint bursts and correlates better with spectral hardness
(and presumably redshift) than either peak flux or fluence. The techniques presented
here are generic and are applicable to the studies of other transient
events. The results also indicate that pattern recognition algorithms are sensitive
to sample completeness; this can influence the study of large astronomical
databases such as those found in a Virtual Observatory.

Subject headings: gamma-rays: bursts — methods: data analysis, statistical — 
instrumentation: miscellaneous 



1. Introduction 

In recent years, data mining algorithms have been used to aid the process of scientific
classification. Data mining is the extraction of potentially useful information from data using
machine learning, statistical, and visualization techniques. Pattern recognition algorithms
(or classifiers) are data mining tools that search for clusters in complex, multi-dimensional
spaces of attributes (observed and/or measured properties). These algorithms typically operate
in one of two modes: supervised (in which the classifier is trained with known classification
instances) and unsupervised (in which classification occurs without training examples).
Algorithms are designed to identify data patterns such as clustering and/or correlations, 
but their limitations must also be understood: it is up to the scientist to interpret physical 
mechanisms responsible for producing identified clusters. Clusters found by classifiers can 
represent source populations; this happens when the class properties are produced by phys- 
ical mechanisms pertaining to the sources. Clusters can also result from the way in which 
source properties are measured; sampling biases, systemic instrumentation errors, and corre- 
lated properties can all force data to cluster and thus give the appearance of class structure 
when there is none. 

Data mining algorithms can be applied to gamma-ray burst classification. Two gamma-ray
burst classes have been recognized for years (Mazets et al. 1981; Norris et al. 1984;
Klebesadel 1992; Hurley 1992; Kouveliotou et al. 1993) on the basis of duration and spectral
hardness. Class I (Long) bursts are longer, spectrally softer, and have larger fluences
than Class II (Short) bursts. Recent classification schemes have used data collected by
BATSE (the Burst And Transient Source Experiment on NASA's Compton Gamma-Ray
Observatory; CGRO) (Meegan et al. 1992) because this large database (2704 bursts observed
between 1991 and 2000) was collected by a single instrument with known instrumental
characteristics. Three attributes of BATSE gamma-ray burst data have been identified
as being significant (using techniques such as principal component analysis) in delineating
gamma-ray burst classes (Mukherjee et al. 1998; Bagoly et al. 1998; Hakkila et al. 2000a):
duration T90 (the time interval during which 90% of a burst's emission is detected), fluence
S (time integrated flux in the 50 to 300 keV spectral range), and spectral hardness HR321
(the 50 to 300 keV fluence divided by the 25 to 50 keV fluence). Logarithmic measures of
these values are typically used because classes are more clearly delineated when attributes
are defined logarithmically. Historically, bursts with durations T90 < 2 seconds have
typically been considered to belong to Class II.

Data mining techniques allow a third gamma-ray burst class to be identified in BATSE 
data. Three classes are preferably recovered instead of two by both statistical clustering 
techniques (Mukherjee et al. 1998; Horvath 1998) and unsupervised pattern recognition al- 
gorithms (Roiger et al. 2000; Balastegui et al. 2001; Rajaniemi and Mahonen 2002). The 
third class forms at the boundary between Class I and Class II, and primarily contains the 
softest and smallest fluence bursts from Class I. Since Class II appears to be relatively
unchanged by the detection of the third class, the three classes are called Class II (short, small
fluence, hard bursts), Class I (long, large fluence bursts of intermediate hardness), and Class
III (intermediate duration, intermediate fluence, soft bursts; also referred to as Intermediate
bursts).

The boundaries between classes are fuzzy, as some bursts are not easily assigned to a
specific class. Different data mining algorithms do not necessarily assign individual bursts
to the same classes because each classifier operates under different assumptions concerning
correlations between data attributes and how these relate to clustering criteria. Some classifiers
are designed to work with nominal data while others are not; some employ Bayesian
while others employ frequentist statistics; some assume a priori distributions of attribute
values while others do not. The results of any classifier can change if the size and makeup
of the data set are altered. Data errors can influence the results since few classifiers currently
include measurement error information in their analyses. However, irreproducibility
is not necessarily a fault of machine learning methodology. Each classifier provides different 
insights into the way the data are structured. For any given data set, there is a good pos- 
sibility that some critical experiment or observation has not been performed, or that some 
key measurements have yet to be made, or that the relative importance of some attribute 
has been underestimated or overestimated. There is no correct way of classifying a dataset 
because the usefulness of the classification depends on the insights gained from it by the user. 

In a previous application of supervised classification (Hakkila et al. 2000a) to gamma-ray
burst data we hypothesized that Class III does not necessarily represent a separate
source population. Instead, instrumental and sampling biases have been proposed as a
way in which some Class I bursts can take on Class III characteristics. Due to a known
correlation between hardness and intensity (Paciesas et al. 1992; Mitrofanov et al. 1992;
Nemiroff et al. 1994; Atteia et al. 1994; Dezalay et al. 1997; Qin et al. 2001), small fluence
Class I bursts are typically softer than bright Class I bursts; this is supported by principal
component analysis (Bagoly et al. 1998). Since the correlation results from a shift
to smaller average peak spectral energy ($E_p$) at lower peak flux but not from changes in
the average low-energy spectral index ($\alpha$) or the average high-energy spectral index ($\beta$)
(Mallozzi et al. 1995; Hakkila et al. 2000a; Paciesas et al. 2002), this correlation has been
attributed to the softer bursts being generally at larger cosmological redshift. (This con- 
clusion may not necessarily be correct because a broad range of gamma-ray burst luminosi- 
ties is suggested from redshifts of gamma-ray burst afterglows (van Paradijs et al. 2000); 
however, it should be noted that only a small afterglow sample is available.) Additionally,
fluences and durations of some Class I bursts can systematically be underestimated
(Koshut et al. 1996; Bonnell et al. 1997; Hakkila et al. 2000a); we refer to this as the fluence
duration bias (Hakkila et al. 2000a). Simply put, fluences and durations of some Class
I bursts (particularly those with the smallest peak fluxes) can be underestimated because
low signal-to-noise emission goes unrecognized; combined with their spectral softness, this
gives them characteristics consistent with Class III.

Unfortunately, the fluence duration bias has been difficult to quantify. The amounts
by which the fluence and duration of an individual burst are affected depend on the fitted
background rates and estimated burst durations at all energies; to remove the background
properly assumes a priori knowledge of the burst's intrinsic temporal and spectral structure.
Such a priori knowledge can only be acquired in the absence of background, and gamma-ray
burst observations are inherently noisy. Very high signal-to-noise estimates of a burst's
temporal and spectral structure can only be obtained for a small number of the bursts with
the largest fluences. These well-measured quantities are not entirely intrinsic; it appears
that even the brightest bursts require systematic correction because they are at large redshift 



Our objective is to determine whether or not the fluence duration bias can account for
the number of bursts with Class III characteristics. In order to do this, we determine the total
number of bursts that exhibit Class III characteristics using several different unsupervised
classifiers. Then, we statistically model the suspected bias and determine whether it is strong
enough to produce the Class III bursts.

A number of pertinent questions will have to be addressed in pursuing this objective: 
Do theoreticians need to develop models for one, two, or three gamma-ray burst classes? 
How can data mining techniques be used to aid scientific classification? Are systematic 
effects present in data collected by BATSE or other gamma-ray burst experiments that alter 
classification structures? Can these effects be understood? Can information on intrinsic
properties of the source population be extracted if these effects are present? Can future
instruments be designed to minimize or eliminate these effects?

2. Class III and the Fluence Duration Bias 

2.1. The Significance and Size of Class III 

We systematically compare the output of various unsupervised algorithms in conjunction 
with a homogeneous gamma-ray burst data set obtained with one set of instrumental settings. 
We use the online gamma-ray burst ToolSHED (Haglin et al. 2000) (SHell for Expeditions 
in Data mining) that we are developing to aid our analysis. This ToolSHED (currently 
ready for pre-beta testers at http://grb.mnsu.edu/grbts/) provides a suite of supervised 
and unsupervised data mining tools and a large database of preprocessed gamma-ray burst 
attributes. It allows users to classify data using more than one algorithm in order to identify 
consistencies in the different classification techniques and thereby gain better insight into 
the heterogeneous nature of the data. 

In order to further minimize the effects of instrumental biases, we have limited our 
database to bursts detected with a homogeneous set of BATSE trigger criteria. The database 
consists of bursts from the BATSE Current Burst Catalog
(http://f64.nsstc.nasa.gov/batse/grb/catalog/current/). Bursts included are non-overwriting
and non-overwritten bursts (i.e., those whose BATSE readout periods did not overlap detectable
bursts immediately preceding or following their detection) triggering at least two BATSE
detectors in the 50 to 300 keV energy range with the trigger threshold set 5.5σ above
background on any of the three trigger timescales. We require all classifiers to use only
the three attributes of log(T90), log(HR321), and log(S).

We apply four unsupervised ToolSHED algorithms with different approaches to cluster- 
ing. These algorithms are ESX, a Kohonen neural network, the unsupervised EM algorithm, 
and the unsupervised Kmeans algorithm. 

ESX (Roiger et al. 1999) is a classifier that forms a three-level tree structure. The root 
level of the tree contains summary information for all bursts. The second (concept) level of 
the tree sub-divides the root level into clusters formed as a result of applying a similarity- 
based evaluation function. The third tree level holds the individual bursts. 

A Kohonen neural network (Kohonen 1982) architecture is represented as a collection of 
input and output units. During training, the input units iteratively feed the burst instances 
to the output units. The output units compete for the burst instances. The output units
collecting the most bursts are saved. The saved units represent the clusters found within the
data.

The unsupervised EM algorithm (Dempster et al. 1977) assumes that the attribute 
space can be subdivided into a predetermined number of normally distributed clusters. An 
initial guess is made as to the properties of each random distribution, and this guess is used 
to calculate probabilities that bursts belong to each cluster. The cluster characteristics are 
iteratively adjusted until all clusters are optimally-defined. 
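
The iteration described above can be sketched for a single attribute such as log(T90). This is a minimal illustration rather than the ToolSHED implementation: the function name, the evenly spaced initial guess, the fixed iteration count (used in place of a convergence test), and the synthetic data are all assumptions made for the example.

```python
import math
import random


def em_gaussian_mixture_1d(values, k, n_iter=200):
    """Fit k one-dimensional normal clusters by expectation-maximization.

    Returns (weights, means, sigmas). Illustrative sketch only.
    """
    n = len(values)
    lo, hi = min(values), max(values)
    # Initial guess (assumed): means spread evenly over the data range,
    # one common width, equal weights.
    means = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    sigmas = [(hi - lo) / (2.0 * k)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: probability that each burst belongs to each cluster.
        resp = []
        for v in values:
            dens = [
                w * math.exp(-0.5 * ((v - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
                for w, m, s in zip(weights, means, sigmas)
            ]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate each cluster from its weighted members.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, values)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-3)  # floor avoids collapse onto one point
            weights[j] = nj / n
    return weights, means, sigmas


# Synthetic log-duration values (made up for illustration, not BATSE data).
rng = random.Random(1)
sample = [rng.gauss(-0.3, 0.3) for _ in range(300)] + [rng.gauss(1.5, 0.4) for _ in range(300)]
weights, means, sigmas = em_gaussian_mixture_1d(sample, k=2)
print(sorted(means))
```

The number of clusters is supplied by the caller, which mirrors the "predetermined number" of clusters mentioned in the text.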

The Kmeans (Lloyd 1982) algorithm randomly selects K data points as initial cluster 
centers. Each instance is then placed in the cluster to which it is most similar. Once all 
instances have been placed in their appropriate cluster, the cluster centers are updated by 
computing the mean of each new cluster. The process continues until an iteration of the 
algorithm shows no change in the cluster centers. 
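
The steps just listed can be sketched in a few lines for a single attribute. The function name, the seed handling, and the toy log(T90) values below are assumptions for illustration; "most similar" is taken here to mean the smallest absolute difference.

```python
import random


def kmeans_1d(values, k, seed=0, max_iter=100):
    """Lloyd-style K-means on a list of floats; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)  # K data points chosen as initial centers
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each instance joins the cluster with the nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center becomes the mean of its current members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:  # an iteration with no change: stop
            break
        centers = new_centers
    return centers, labels


# Toy log(T90) values (made up for illustration).
log_t90 = [-1.0, -0.9, -0.8, 1.4, 1.5, 1.6]
centers, labels = kmeans_1d(log_t90, k=2)
print(sorted(centers), labels)
```

Because the initial centers are drawn at random, different seeds can give different partitions on less cleanly separated data, which is one reason results from a single run should be treated with caution.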

Predetermined classification significance helps define the number of classes that can 
be recovered. When allowed to find an optimum number of classes based on a default 
significance, the aforementioned classifiers typically recover three to four burst classes as 
opposed to the two traditionally accepted classes. This indicates that the two traditional 
classes are not considered to be the optimal solution. 

We force all four classifiers to recover two, three, and four classes because we hope that 
by studying the properties of these force-recovered classes we can determine why the three- 
class solution is preferred over the two-class solution. The properties of three force-recovered 
classes are indicated in Table 1. The properties of these classes are similar to those obtained
using other clustering techniques (Mukherjee et al. 1998; Horvath 1998; Roiger et al. 2000; 
Balastegui et al. 2001; Rajaniemi and Mahonen 2002), so we again refer to these as Class 
I (Long), Class II (Short), and Class III (Intermediate). However, these previous results 
typically place fewer bursts in Class III than Class I, whereas three of our four classifiers 
place the largest number in Class III. Therefore, our analysis finds Class III to be the 
dominant class. 

In order to explain why the percentage of Class III members is so large, we examine the
placement of Class III bursts when classifiers are forced to recover only two classes (Short
and Long). The results are remarkably consistent: all four classifiers fail to clearly delineate
the traditionally-accepted Short and Long classes, and each places a large number of soft
Class III bursts in with the hard Short class (see Figure 1). This is surprising, since Class
III clustering is not obvious to the naked eye in the hardness vs. duration parameter space
whereas Short and Long burst clustering is. The hardness vs. duration boundary is not
chosen by the classifiers because a sharper one exists in the fluence vs. duration parameter
space (Figure 2); the boundary separating short faint bursts from long bright bursts is more
significant than that separating short hard bursts from long soft bursts.

It is surprising that fluence plays such an important role in the classification. First, 
fluence is an extrinsic attribute (since it represents a convolution of a burst's luminosity and 
distance) as opposed to hardness and duration, so there is no reason why fluence clustering 
should relate to any physical differences between burst classes. Second, one would intuitively 
expect a burst with a longer duration to have a larger fluence, indicating that fluence and 
duration should be highly-correlated attributes. Thus, clustering in the duration attribute 
can also cause clustering in the fluence attribute, and the use of fluence as a classification 
attribute magnifies the clustering importance of duration relative to hardness. The break 
between short faint and long bright bursts therefore appears due in part to the use of fluence, 
an attribute which is of questionable value. 

To determine if the fluence bias can be removed, we eliminate this attribute and perform
the classification using only the log(T90) duration and log(HR321) hardness ratio. Even without
fluence, the classifiers again prefer to recover three classes instead of two, and Class III is
not diminished in size. Examination of the three class properties indicates that log(T90)
has been used almost exclusively to delineate the classes; hardness is almost ignored by the
classifiers. This is surprising, since the eye tends to delineate two burst classes. We check
this result by supplying only the T90 attribute to the classifiers. Indeed, the classifiers again
return three classes rather than two (Class I bursts have T90 > 6 sec, Class II bursts have
T90 < 1.4 sec, and Class III bursts have 1.4 sec < T90 < 6 sec). However, the size of Class
III has been diminished in this reclassification and it is no longer the largest class; this result
is consistent with that obtained earlier using only the duration attribute (Horvath 1998).
We conclude that strong evidence exists for the three-class structure.
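
The duration-only boundaries quoted above amount to a simple lookup, sketched here. The function name is hypothetical, and the treatment of bursts lying exactly on a boundary (1.4 or 6 seconds) is an assumption, since the text gives strict inequalities on both sides.

```python
def duration_class(t90_seconds):
    """Assign a burst class from T90 alone, using the boundaries quoted in the text."""
    if t90_seconds < 1.4:
        return "II"   # Short
    if t90_seconds <= 6.0:
        return "III"  # Intermediate (boundary values placed here by assumption)
    return "I"        # Long


for t90 in (0.5, 3.0, 20.0):
    print(t90, duration_class(t90))
```

Hard cuts like these ignore the fuzzy class boundaries discussed earlier, so the function is only a convenient summary of where the classifiers placed the divisions.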

Before accepting the new class as a separate source population, we must try to dis- 
count alternative explanations concerning its existence. It is possible that the classifiers 
have detected a data cluster resulting from the way that the data have been collected, rather 
than from a separate and distinct source population. We consider it unlikely that Class III 
represents a statistical anomaly since it has been found by four classifiers using different 
algorithms, and since stringent requirements have been imposed for each classifier to find 
additional classes. Thus, Class III could result from a systematic effect such as an instru- 
mental or sampling bias. The suspected bias appears to primarily affect duration and the 
coupled yet extrinsic attribute of fluence. 

This conclusion leads us again to examine the hypothesized fluence duration bias. This 
bias could provide a mechanism for underestimating both fluence and duration of some 
Class I bursts (particularly faint soft ones), and could cause these bursts to take on Class III
characteristics. However, given the increased size of Class III, the bias would need to be
strong to account for it.

2.2. Inadequacy of the Fluence Duration Bias Model to Explain Class III Properties

In an attempt to quantify the fluence duration bias, we have developed a simple model
of the bias that can be applied statistically. The model only influences Class I and Class III
bursts (as defined by the EM algorithm), since the bias has not been hypothesized to alter
Class II properties. In a previous work (Hakkila et al. 2000a) we estimated the maximum
amount by which the fluences and durations of five bright bursts might need to be corrected
if their signal-to-noise ratios were reduced; our simple model averages these values to obtain
maximum corrections of fluence and duration as functions of $p_{1024}$ (peak flux measured on
the 1024 ms timescale). We do not know how much the fluence of an individual burst might
need to be corrected; we therefore assume that the fluence of each burst should be corrected
by an amount between zero and the maximum fluence correction $S_{\max}$, and that the duration
of each burst should be corrected by an amount between zero seconds and the maximum
duration correction $T90_{\max}$. The amount of the maximum correction is dependent upon the
signal-to-noise ratio and thus on the peak flux; the suspected bias is more pronounced for
bursts with peak fluxes near the detection threshold. We naively assume a probability $p$ that
each burst's measured fluence and duration will be altered with equal probability in the
intervals $[0, \log(S)_{\max}]$ and $[0, \log(T90)_{\max}]$. Thus, the modeled amount by which an
individual burst's fluence would be affected by the bias is $p\log(S)_{\max}$ and the amount by
which its duration would be affected is $p\log(T90)_{\max}$. The problem can be inverted to
estimate how much observed burst fluences and durations have been underestimated as a
function of $p_{1024}$.
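
A minimal sketch of this statistical correction follows. The paper's maximum-correction curve comes from averaging five bright bursts and is not reproduced in this section, so `max_log_fluence_correction` below is a purely hypothetical stand-in with an invented functional form; only the structure (a uniform random fraction of a peak-flux-dependent maximum, added back to the observed value) reflects the description above.

```python
import math
import random


def max_log_fluence_correction(p1024):
    """HYPOTHETICAL maximum correction log(S)_max as a function of peak flux.

    Invented for illustration: zero for bright bursts (p1024 >= 10) and
    growing toward the detection threshold. Not the curve used in the paper.
    """
    return max(0.0, 0.5 * (1.0 - math.log10(p1024)))


def corrected_log_fluence(log_s_obs, p1024, rng):
    """Add back a random fraction p of the maximum correction (inverting the bias)."""
    p = rng.random()  # uniform in [0, 1)
    return log_s_obs + p * max_log_fluence_correction(p1024)


rng = random.Random(0)
print(corrected_log_fluence(-6.0, 0.5, rng))   # faint burst: shifted upward
print(corrected_log_fluence(-6.0, 20.0, rng))  # bright burst: unchanged
```

The same pattern would apply to log(T90) with its own maximum-correction curve.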

If the fluence duration bias produces Class III properties, then (1) the faint Class I and
Class III bursts (as measured by $p_{1024}$) should show evidence of having their fluences (and
durations) systematically underestimated, and (2) no evidence of this bias should be present
if this combined distribution has been properly corrected for the effect. We would thus like
to compare both the observed distribution and the corrected distribution with the "true"
distribution. Unfortunately, we do not know the "true" distribution.

If we assume that the bias has not affected the fluence and duration distributions of
bright bursts (as measured by $p_{1024}$), then we can compare the corrected and uncorrected
distributions of faint bursts to the observed distributions of bright bursts. The comparison
can be made once we identify how the fluence and duration distributions scale with peak
flux.

If a given burst's intensity were decreased (either by decreasing the burst's luminosity or
by observing the burst at a larger distance), then its fluence would decrease proportionally
to its peak flux. This generic statement is false only in the presence of sampling and/or
instrumental biases. The effect of time dilation due to cosmological expansion is an example
of a sampling bias that can systematically affect fluence count rates differently than peak flux
count rates. Since we measure the peak flux and fluence in the same energy channels, the
primary source of bias is that the observed peak flux can be as little as $(1 + z)^{-1}$ of its actual
value due to time dilation, whereas the fluence would not be expected to be lessened. This
bias would cause the peak flux of distant bursts to be small relative to the fluence; note that
this bias cannot explain Class III characteristics, since Class III bursts have fluences that are
small relative to their peak fluxes. In going from bright bursts to faint bursts, a decreasing
signal-to-noise ratio can cause fluences of faint bursts to decrease non-proportionally to peak
fluxes; this is an example of a statistical (rather than systematic) instrumental bias.

Sampling biases can cause the faint burst distribution (as measured by either peak flux 
or fluence) to be different than the bright burst distribution. Trigger biases can cause bursts 
with certain characteristics to trigger disproportionately relative to other bursts. However, 
trigger biases that have been proposed prior to this manuscript do not appear to alter the 
makeup of the BATSE dataset by large amounts (Meegan et al. 2000). We therefore assume 
in testing the fluence duration bias that it is primarily responsible for causing a burst's fluence 
to change not in proportion to the change in its peak flux, and that the faint burst distribution 
of Class I + III bursts would be the same as the bright distribution in the absence of this 
effect. 

The distribution of burst fluences at a given peak flux indicates bursts with different
time histories; greater fluence typically belongs to longer bursts with more pulses and smaller
fluence typically belongs to shorter bursts with fewer pulses. If these burst peak fluxes were
all decreased by the same amount, then their fluences would decrease proportionally along a
line defined by $\log(S)_{\mathrm{line}} = \log(p_{1024}) + R$ (where $R$ is an arbitrary constant). The difference
$\Delta\log(S) = \log(S)_{\mathrm{line}} - \log(S)_{\mathrm{obs}}$ obtained for each burst indicates the fluence offset of each
burst from the line given its peak flux. The distribution of $\Delta\log(S)$ can be examined for
bright bursts (i.e., those presumably unaffected by the bias) and for faint bursts (i.e., those
affected by the bias). In the absence of any biases, the faint distribution will be similar to
the bright distribution. If the fluence duration bias is present, then the faint distribution will
differ from the bright distribution. The aforementioned statistical correction should make
the "corrected" fluence distribution $\Delta\log(S)_{\mathrm{corr}} = p\log(S)_{\max} - \log(S)_{\mathrm{obs}}$ more compatible
with the bright distribution than is the uncorrected faint distribution.
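
The offset from the proportional-decrease line can be sketched as follows, using the definition of the difference as the line value minus the observed value. The function name and the example numbers are assumptions; the point of the example is that dimming a burst by a common factor in both fluence and peak flux leaves its offset unchanged, which is what allows faint and bright distributions to be compared.

```python
import math


def delta_log_s(fluence, p1024, R=0.0):
    """Offset of a burst from the line log(S)_line = log(p1024) + R."""
    log_s_line = math.log10(p1024) + R
    return log_s_line - math.log10(fluence)


# A burst and the same burst dimmed by a factor of 10 in both quantities
# (illustrative numbers, not catalog values).
bright = delta_log_s(fluence=2.0e-5, p1024=8.0)
dimmed = delta_log_s(fluence=2.0e-6, p1024=0.8)
print(bright, dimmed)  # equal up to rounding
```

A burst whose fluence is underestimated at low peak flux would instead move to a larger offset under this sign convention.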

Figure 3 is a plot of $\log(S)$ vs. $\log(p_{1024})$ for the burst sample used in this study.
Class I, II, and III bursts have been identified using the unsupervised EM algorithm. The
proportional decrease of fluence and peak flux is shown for a hypothetical Class I burst
(diagonal line); the curving line indicates how the bias might affect the measured fluence
of this burst as a function of $p_{1024}$ in the case where $p = 1$. The amount by
which the fluence would need to be corrected, $\Delta\log(S)_{\mathrm{corr}}$, is also shown (vertical line).

We construct eight $\Delta\log(S)$ bins for the set of bright bursts and eight bins for the faint
bursts (the zero point for the $\Delta\log(S)$ scale is arbitrary, so we use $\Delta\log(S) = \log(S) -
\log(p_{1024}) + 6$). The dividing line between "bright" and "faint" bursts is set at $\log(p_{1024}) > 1$
photon cm$^{-2}$ sec$^{-1}$ since bursts brighter than this value should be essentially unaffected by
the proposed fluence duration bias. The faint uncorrected $\Delta\log(S)$ distribution is moderately
different than the bright distribution, with $\chi^2 = 13.8$ for 7 degrees of freedom and
a corresponding probability of $q = 0.055$. The fluence distribution (as determined from
$\Delta\log(S)$) has been shifted to lower values consistent with the fluence duration bias.

In order to test the correction by the proposed model, we correct the fluence of each of the
$i$ bursts by differing amounts $p_i \log(S)_{\max}$. The $\chi^2$ of the corrected faint burst distribution is
again compared to the "control" sample of bright bursts. Since we might have overcorrected
some bursts while undercorrecting others, we run the analysis a total of 100 times and
average the results. The corrected $\Delta\log(S)_{\mathrm{corr}}$ distribution is significantly different than the
bright distribution (we obtain $\langle\chi^2\rangle = 34.0$ for 7 degrees of freedom and a corresponding
probability of $q = 2 \times 10^{-5}$ that the two distributions are identical), indicating that our model
has significantly overcorrected for the suspected bias. Similar results are obtained using the
$\Delta\log(T90)$ distributions.
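
The probabilities quoted for the two comparisons can be checked from the chi-square values and the 7 degrees of freedom alone. The sketch below uses the standard closed-form tail probability of the chi-square distribution for integer degrees of freedom; the function name is an assumption, and the calculation says nothing about how the chi-square values themselves were obtained from the binned data.

```python
import math


def chi2_survival(chi2, dof):
    """Probability that a chi-square variable with `dof` degrees of freedom exceeds `chi2`."""
    if dof % 2 == 0:
        # Even dof: exp(-x/2) times a finite sum of (x/2)^j / j!.
        half = chi2 / 2.0
        term, total = 1.0, 1.0
        for j in range(1, dof // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd dof: complementary error function plus a finite sum of
    # x^(r - 1/2) / (2r - 1)!! for r = 1 .. (dof - 1) / 2.
    root = math.sqrt(chi2)
    term, total = root, 0.0
    for r in range(1, (dof - 1) // 2 + 1):
        total += term
        term *= chi2 / (2 * r + 1)
    tail = math.sqrt(2.0 / math.pi) * math.exp(-chi2 / 2.0) * total
    return math.erfc(root / math.sqrt(2.0)) + tail


print(chi2_survival(13.8, 7))  # uncorrected faint vs. bright: close to 0.055
print(chi2_survival(34.0, 7))  # corrected faint vs. bright: close to 2e-5
```

Both values agree with the probabilities stated in the text to the precision given there.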

Since we have apparently overestimated the amplitude of the fluence duration bias
$p_i \log(S)_{\max}$ for typical bursts, we can decrease our estimate of the bias by introducing a free
parameter $D$ in the relationships $p_i D \log(S)_{\max}$ and $p_i D \log(T90)_{\max}$, where $D = 0$ represents
the uncorrected sample while $D = 1$ indicates the originally hypothesized bias. In Table 2,
we demonstrate the effectiveness of the fluence duration bias for different values of $D$ chosen
arbitrarily. The model fit is only improved when we reduce our estimates of $\log(S)_{\max}$ and
$\log(T90)_{\max}$ significantly.

We have shown previously (Hakkila et al. 2000a) that the maximum time interval used
to calculate burst fluences decreases dramatically near the BATSE detection threshold
(essentially no bursts in the 3B catalog with $p_{1024} < 0.4$ photons cm$^{-2}$ sec$^{-1}$ have fluence
durations > 100 seconds). This indicates that the fluence duration bias causes fluences and
durations of very long bursts with small peak fluxes to be underestimated. Our current
analysis supports this hypothesis: the $\Delta\log(S)$ and $\Delta\log(T90)$ distributions suggest that
shorter bursts with small peak fluxes have probably not been affected by the bias, whereas
some longer bursts have. Our experience with BATSE data analysis procedures is also in
agreement: fluence duration intervals are rarely chosen to be shorter than many tens of
seconds, and the time histories of only a few bursts are particularly susceptible to this bias
(Koshut et al. 1996). This should prevent a systematic bias from being introduced for short
bursts with small peak flux but not necessarily for long bursts with small peak flux.

These results indicate that the fluence duration bias does not influence faint bursts
to the extent hypothesized previously. The shorter Class I bursts, which were originally
thought most likely to take on Class III characteristics via the bias, are apparently affected
the least. The fluence duration bias appears to primarily influence the properties of some
longer BATSE bursts. We conclude that the fluence duration bias is not responsible for the
large number of shorter softer bursts comprising Class III.

3. Sample Incompleteness and the Duration Distribution 

Although the fluence duration bias does not appear to be responsible for the creation
of Class III, our analysis of the proportional decrease of fluence and peak flux has unexpectedly
provided new insight into measured burst properties. The faint fluence and duration
distributions used in classification are truncated because the samples were triggered using
short-timescale peak fluxes. This truncation has the potential of biasing the sample via sample
incompleteness. In order to study the potential effects of sample truncation, we consider the
advantages of a fluence-limited sample relative to a peak flux-limited sample.

An experiment triggering with a short integration window is more likely to detect a short 
burst than an experiment triggering with a long integration window, because in the latter 
case the entire burst fiux can be recorded in a single temporal bin. A peak fiux-limited 
sample is thus biased towards shorter bursts relative to longer bursts because it excludes 
longer bursts having large fluences but with peak fluxes too faint to trigger. However, a long 
timescale trigger (such as one that could trigger on fluence) would prove to be equally-biased. 
A hypothetical experiment triggering on fluence {e.g. one with a 10,000 second integration 
window) would be more likely to detect a faint long burst (having little of its fluence in 
one temporal window) than would an experiment triggering on peak fiux. A fiuence-limited 
sample would be biased towards longer bursts because it would include faint longer bursts 
but exclude faint shorter bursts with the same peak fiux. Figure 3 demonstrates that an 
excessive number of short Class III bursts are found near BATSE's peak fiux trigger; the 
fluence distribution of these bursts is acutely truncated by the peak flux trigger. Thus, Class 
III occupies a fluence vs. peak flux region where the instrumental (peak flux) trigger favors 
detection of shorter bursts over longer ones.

We would like to identify a peak flux measure that does not suffer from truncation of
the duration distribution. The proportional relationship between fluence $S$ and peak flux
$p_{1024}$ as a burst's luminosity is decreased or as its distance is increased provides a method
for identifying such a peak flux measure. We can redefine fluence to be a peak flux by
defining an extremely long temporal window $\tau$ ($\tau$ is a constant) that contains the entire
flux of the sample's longest burst. The fluence divided by this temporal window ($S/\tau$) is
a peak flux (having units of photons cm$^{-2}$ sec$^{-1}$ or ergs cm$^{-2}$ sec$^{-1}$, using an approximate
transformation of $A \approx 2.24 \times 10^{-7}$ ergs photon$^{-1}$) (Hakkila et al. 2000b). The equation
governing this proportional decrease in peak flux and fluence is

$$2\log(\mathcal{F}_0) = \log\left(\frac{S}{A\tau}\right) + \log(p_{1024}) \qquad (1)$$

or $\mathcal{F}_0^2 = (S/(A\tau))\,p_{1024}$, where $\mathcal{F}_0$ has units of flux and is thus a measure of burst intensity.
We define this quantity as the dual timescale peak flux (Hakkila et al. 2002) since it uses
two different timescale measurements. The minimum value of the dual timescale peak flux
$\mathcal{F}_0$ can be called the dual timescale threshold. The dual timescale peak flux $\mathcal{F}$ is merely a
multiple of this threshold value, $\log(\mathcal{F}) = \log(\mathcal{F}_0) + \log(K)$ (or $\mathcal{F} = K\mathcal{F}_0$), where $K$ is a
constant. A dual timescale threshold can be defined as an instrumental setting for a gamma-ray
burst experiment (e.g. by requiring $(S/\tau)(p_{1024})$ to exceed a trigger value), as a selection
process on previously-detected events in a standard experiment, or with archival data from an
experiment triggering independently on one temporal trigger at a time (such as BATSE). This
latter concept is not new; several studies have developed their own post facto BATSE triggers
using archival time series data (Schmidt 1999; Kommers et al. 2001; Stern et al. 2001).
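As an illustration, equation (1) can be evaluated with a minimal Python sketch. The function name, the 500 second window, and the example burst values are hypothetical choices made only for this example; the conversion factor is the approximate value quoted in the text.

```python
import math

# Approximate mean energy per photon (ergs per photon) quoted in the text.
A_ERG_PER_PHOTON = 2.24e-7


def dual_timescale_peak_flux(fluence, p1024, tau, a=A_ERG_PER_PHOTON):
    """Return F = sqrt((S / (A * tau)) * p1024) in photons cm^-2 s^-1.

    fluence : S, in ergs cm^-2
    p1024   : 1024 ms peak flux, in photons cm^-2 s^-1
    tau     : long temporal window in seconds (one constant for the sample)
    """
    if fluence <= 0 or p1024 <= 0 or tau <= 0:
        raise ValueError("fluence, p1024 and tau must all be positive")
    return math.sqrt(fluence / (a * tau) * p1024)


# Hypothetical burst, and the same burst with S and p1024 both dimmed tenfold
# (lower luminosity or larger distance): F drops by the same factor of ten.
bright = dual_timescale_peak_flux(1.0e-5, 5.0, tau=500.0)
dimmed = dual_timescale_peak_flux(1.0e-6, 0.5, tau=500.0)
print(bright, dimmed, dimmed / bright)
```

Because $\mathcal{F}$ depends only on the product $S\,p_{1024}$, a burst moved along the proportional dimming track changes its $\mathcal{F}$ by exactly the dimming factor, which is the property the definition is meant to capture.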

The dual timescale peak flux treats longer bursts having larger fluences and smaller peak
fluxes on an equal basis with shorter bursts having smaller fluences and larger peak fluxes:
these bursts with different temporal structures have something in common, which is that they
have equal probability of detection using the dual timescale peak flux. Their differences must
therefore be defined by a line orthogonal to the dual timescale peak flux, e.g. one that
satisfies the relationship

$$\log(\Gamma) = \log(S/A) - \log(p_{1024}) \qquad (2)$$

or $\Gamma = S/(A\,p_{1024})$, where $\Gamma$ has units of time and represents a duration. We call $\Gamma$ the flux
duration; it measures the total time that a burst could emit at its peak flux in order to
produce its fluence. Longer bursts typically have large $S/p_{1024}$ values and shorter bursts
should typically have small $S/p_{1024}$ values. In fact, the correlation for Class I + III bursts
demonstrated in Figure 4 has a Spearman Rank-Order correlation significance of $10^{-\ldots}$ that
$\Gamma$ and T90 are uncorrelated.
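The flux duration of equation (2) is simple to evaluate. The sketch below is illustrative only: the function name and the rectangular test burst are hypothetical, and the photon energy is the approximate conversion factor used in the text.

```python
# Approximate mean energy per photon (ergs per photon) quoted in the text.
A_ERG_PER_PHOTON = 2.24e-7


def flux_duration(fluence, p1024, a=A_ERG_PER_PHOTON):
    """Return Gamma = S / (A * p1024) in seconds (equation 2).

    fluence : S, in ergs cm^-2
    p1024   : 1024 ms peak flux, in photons cm^-2 s^-1
    """
    if fluence <= 0 or p1024 <= 0:
        raise ValueError("fluence and p1024 must be positive")
    return fluence / (a * p1024)


# Hypothetical rectangular burst: constant 2 photons cm^-2 s^-1 for 10 s.
peak = 2.0
duration = 10.0
fluence = A_ERG_PER_PHOTON * peak * duration  # ergs cm^-2
gamma = flux_duration(fluence, peak)
print(gamma)
```

For a rectangular light curve the flux duration equals the true duration; for a burst that spends most of its time below peak, $\Gamma$ is shorter than the emission time.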

The dual timescale peak flux does not favor bursts of any duration (longer than the
smallest 1024 ms integration window), whereas peak flux and fluence each do so by truncating the
distribution and thereby favoring "faint" bursts (e.g. those near the threshold) of shorter
or longer durations, respectively. Since the dual timescale peak flux does not truncate the duration
distribution, we can say that the dual timescale peak flux-limited sample retains the duration
characteristics of the sample by preserving the duration $S/p_{1024}$ relative to the intensity
$(S/\tau)(p_{1024})$.

It was recognized soon after BATSE's launch (Petrosian et al. 1994) that long-timescale
triggers underestimated the intensities of shorter bursts and biased the sample. However, our
analysis demonstrates (perhaps surprisingly) that short timescale triggers would
also bias the sample against longer bursts.
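The two opposing biases can be made concrete with a toy calculation. The sketch below assumes idealized rectangular light curves and ignores background noise entirely; the burst parameters, window lengths, and thresholds are hypothetical values chosen only to show the asymmetry, and do not represent BATSE's actual trigger logic.

```python
def window_photons(peak, duration, window):
    """Photons cm^-2 collected in the best-placed window of the given length
    for a rectangular burst with constant flux `peak` lasting `duration` s."""
    return peak * min(duration, window)


def triggers(peak, duration, window, min_photons):
    """True if the burst delivers at least `min_photons` within one window."""
    return window_photons(peak, duration, window) >= min_photons


# Two hypothetical bursts: (peak flux in photons cm^-2 s^-1, duration in s).
short_burst = (1.0, 2.0)    # bright peak, small photon fluence (2 cm^-2)
long_burst = (0.2, 50.0)    # faint peak, large photon fluence (10 cm^-2)

# Hypothetical short-window (peak flux) trigger: 0.5 photons cm^-2 s^-1
# averaged over 1.024 s.
short_window, short_threshold = 1.024, 0.5 * 1.024
# Hypothetical long-window (fluence-like) trigger: 5 photons cm^-2 in 1000 s.
long_window, long_threshold = 1000.0, 5.0

for name, (peak, duration) in (("short", short_burst), ("long", long_burst)):
    by_peak = triggers(peak, duration, short_window, short_threshold)
    by_fluence = triggers(peak, duration, long_window, long_threshold)
    print(name, "peak-flux trigger:", by_peak, "fluence trigger:", by_fluence)
```

In this toy case the short burst is seen only by the short-window trigger and the long burst only by the long-window trigger, so either single-timescale trigger alone yields a duration-dependent sample.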

Figure 5 demonstrates how BATSE's peak flux trigger influences the number of events
placed in Class III relative to those placed in Class I. Shorter bursts (small $S/p_{1024}$; denoted
by region 'C') have been detected while faint longer bursts (large $S/p_{1024}$; denoted by 'A')
have been excluded by the trigger. A hypothetical fluence trigger allowing the faintest shorter
bursts currently detected by BATSE to trigger would not resolve this problem: shorter bursts
(small $S/p_{1024}$; denoted by 'B') would go undetected by the fluence trigger relative to longer
bursts (large $S/p_{1024}$; denoted by 'D'). A dual timescale threshold is shown (diagonal dashed
line) that favors neither longer nor shorter BATSE bursts. The threshold excludes most of
the bursts previously identified as Class III because these have been favored by the one-second
trigger window relative to longer bursts. It also excludes many Class II bursts, which
are both typically faint and shorter than the one-second trigger window.

We make a cut on our BATSE sample that is equally complete for both longer (large
$S/p_{1024}$) and shorter (small $S/p_{1024}$) bursts and use this as our dual timescale threshold. This
threshold follows the relation $\log(S) + \log(p_{1024}) = -6.5$, and has been chosen so that even
the longest bursts with the largest $S/p_{1024}$ values are detected by BATSE's actual peak flux
trigger. This is demonstrated by the diagonal dotted line in Figure 4, and corresponds to
a dual timescale peak flux (via equation 1) of $\mathcal{F}_0 = 0.048$ photons cm$^{-2}$ sec$^{-1}$ for $\tau = 617$
seconds (the T90 of the longest burst in the sample).
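The quoted threshold value can be checked directly from equation (1). The following sketch uses only numbers given in the text (the cut at $-6.5$, $\tau = 617$ seconds, and the approximate photon energy); the function names are hypothetical, and the treatment of a burst lying exactly on the cut as "passing" is an assumption.

```python
import math

A = 2.24e-7      # ergs per photon, approximate conversion used in the text
TAU = 617.0      # seconds, T90 of the longest burst in the sample
LOG_CUT = -6.5   # log(S) + log(p1024) along the adopted threshold


def threshold_flux(log_cut=LOG_CUT, a=A, tau=TAU):
    """F0 from equation (1): F0^2 = S * p1024 / (A * tau)."""
    return math.sqrt(10.0 ** log_cut / (a * tau))


def relative_intensity(fluence, p1024, log_cut=LOG_CUT):
    """K = F / F0; it depends only on the product S * p1024."""
    return math.sqrt(fluence * p1024 / 10.0 ** log_cut)


def passes_threshold(fluence, p1024, log_cut=LOG_CUT):
    """True if the burst is at or above the dual timescale threshold
    (inclusive boundary is an assumption of this sketch)."""
    return math.log10(fluence) + math.log10(p1024) >= log_cut


print(f"F0 = {threshold_flux():.3f} photons cm^-2 s^-1")
```

Evaluating `threshold_flux()` gives approximately 0.048 photons cm$^{-2}$ sec$^{-1}$, in agreement with the value stated above.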

We wish to determine how the sample properties vary with dual timescale peak flux. We
thus divide our sample of Class I + III into four subsamples with different dual timescale
peak fluxes: bright bursts ($\log S + \log p_{1024} > -5.1$,
or $\langle K \rangle = 7.49$), moderately bright bursts ($-5.8 < \log S + \log p_{1024} < -5.1$, or $\langle K \rangle = 3.34$),
faint bursts ($-6.5 < \log S + \log p_{1024} < -5.8$, or $\langle K \rangle = 1.49$), and the faintest bursts
($\log S + \log p_{1024} < -6.5$, or $\langle K \rangle = 0.66$). The first three samples are "brighter" than the
dual timescale threshold; the faintest sample consists of bursts fainter than the dual timescale
threshold and is primarily composed of bursts from Class III. The bin sizes are chosen so
that the three with $\mathcal{F} > \mathcal{F}_0$ contain similar numbers of bursts, and so that each bin contains
enough bursts to constitute a reasonable statistical sample.

We identify three flux duration intervals from the sample: longer bursts ($\log S >
\log p_{1024} - 5.6$), middle bursts ($\log p_{1024} - 6.1 < \log S < \log p_{1024} - 5.6$), and shorter bursts
($\log S < \log p_{1024} - 6.1$). The bin sizes are again chosen so that each bin contains similar
numbers of bursts, and so that each bin contains enough bursts to constitute a reasonable
statistical sample. Longer bursts as measured by the flux duration ($\langle\Gamma\rangle = 20$ seconds) are
also long as measured by T90 ($\langle T_{90}\rangle = 71$ seconds); the same correlation holds true for
middle bursts ($\langle\Gamma\rangle = 6.25$ seconds and $\langle T_{90}\rangle = 24$ seconds) and shorter bursts ($\langle\Gamma\rangle = 2$
seconds and $\langle T_{90}\rangle = 8$ seconds). The quantity $T_{90}/\Gamma$ is the burst emission time relative to
the flux duration; this is the amount by which the actual burst emission time is stretched
relative to the time interval during which the burst could have emitted at the peak flux
rate. It is interesting to note that $\langle\log\Gamma\rangle \approx \log(T_{90})^{0.\ldots}$ for Class I + III bursts. Bursts with
$\log\Gamma$ and $\log(T_{90})$ values that do not closely follow this relationship have unconventional time
histories (see Figure 4).
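The intensity and flux duration intervals defined above amount to a two-way binning in the ($\log S$, $\log p_{1024}$) plane. A minimal sketch follows, assuming the approximate photon energy from the text; the function names are hypothetical, and since the text uses strict inequalities, the assignment of bursts lying exactly on a boundary is an assumption.

```python
A = 2.24e-7  # ergs per photon, approximate conversion used in the text


def intensity_bin(log_s, log_p1024):
    """Bin by dual timescale peak flux, i.e. by log(S) + log(p1024)."""
    x = log_s + log_p1024
    if x > -5.1:
        return "bright"
    if x > -5.8:
        return "moderately bright"
    if x > -6.5:
        return "faint"
    return "faintest"  # at or below the dual timescale threshold


def duration_bin(log_s, log_p1024):
    """Bin by flux duration, i.e. by log(S) - log(p1024)."""
    d = log_s - log_p1024
    if d > -5.6:
        return "longer"
    if d > -6.1:
        return "middle"
    return "shorter"


def boundary_seconds(offset, a=A):
    """Flux duration (s) on the line log(S) = log(p1024) + offset,
    from Gamma = S / (A * p1024)."""
    return 10.0 ** offset / a


print(round(boundary_seconds(-5.6), 2), round(boundary_seconds(-6.1), 2))
print(intensity_bin(-5.0, 0.0), duration_bin(-5.0, 0.0))
```

With this conversion the duration boundaries correspond to flux durations of roughly 11 and 3.5 seconds, which bracket the quoted mean values of 20, 6.25, and 2 seconds.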

The attribute $\Gamma$ is closely related to GRB duty cycle (Hakkila et al. 2000b). The duty
cycle measures the persistence of burst emission via the relationship

$$\Psi = \frac{S}{A \cdot T_{90} \cdot p_{64}} \qquad (3)$$

where $A$ is the average energy per photon and $p_{64}$ is the 64 ms peak flux. A large duty cycle
($\Psi \approx 1$) indicates persistent emission whereas a small duty cycle ($\Psi \ll 1$) indicates sporadic
emission. Using equation (2), it can be seen that $\Psi \sim \Gamma/T_{90}$. Thus, a burst with a large
$\Gamma/T_{90}$ ratio is persistent because it emits at a high rate for a long time relative to its total
duration.
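The relation between duty cycle and flux duration can be illustrated numerically. In the sketch below the function names and burst values are hypothetical, and the 64 ms and 1024 ms peak fluxes are assumed equal so that the approximate relation becomes exact; for real bursts the two peak fluxes generally differ.

```python
A = 2.24e-7  # ergs per photon, approximate conversion used in the text


def duty_cycle(fluence, t90, p64, a=A):
    """Psi = S / (A * T90 * p64), equation (3)."""
    return fluence / (a * t90 * p64)


def flux_duration(fluence, p1024, a=A):
    """Gamma = S / (A * p1024), equation (2)."""
    return fluence / (a * p1024)


# Hypothetical burst: emits at its peak rate of 2 photons cm^-2 s^-1 for a
# total of 5 s, spread sporadically over T90 = 20 s.
peak = 2.0
t90 = 20.0
fluence = A * peak * 5.0  # ergs cm^-2

psi = duty_cycle(fluence, t90, p64=peak)
ratio = flux_duration(fluence, p1024=peak) / t90
print(psi, ratio)
```

Both quantities evaluate to 0.25 here: the burst emits at its peak rate for one quarter of its total duration.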

We have previously shown that Class II bursts have larger values of $\Psi$ and harder spectral
indices than Class I and Class III bursts (Hakkila et al. 2000b), supporting the hypothesis
that these short, hard bursts belong to a different source population. On the other hand,
Class III bursts are generally softer than Class I but have similar $\Psi$ values; the properties of
these two classes overlap considerably.

If our hypothesis is correct that Class I + III comprises one population, then we expect
that $\mathcal{F}$ and $\Gamma$ will deconvolve complex relationships previously measured with the attributes
$S$ and $p_{1024}$. Figure 6 demonstrates the relationship between HR321 and $\mathcal{F}$ for the combined
Class I + III burst sample. A strong correlation exists between hardness and dual timescale
peak flux for bursts of all durations; the hardness ratios are similar for all bursts of the same
$\mathcal{F}$ regardless of whether T90 or $\Gamma$ is used. It is also seen that the faintest bursts in the
sample (short bursts fainter than the dual timescale trigger) appear to extend this relation.
This evidence supports our hypothesis that the bulk of the Class III bursts are short Class
I bursts that have preferentially been detected by BATSE's short timescale trigger.

If a corresponding sample of longer Class I bursts is detected (by having a lower peak
flux trigger threshold and/or by having some bursts trigger on a longer timescale), then
these bursts most likely would be as soft as the Class III bursts. We suggest that the FXTs
(Fast X-ray Transients) found by BeppoSAX (Heise et al. 2001) using an x-ray trigger and
subsequently identified in BATSE data (Kippen et al. 2001) might be long soft bursts that
previously escaped detection.

The slope of the log(HR321) vs. log(peak flux) relation is largest when $\log(\mathcal{F})$ is used
as the peak flux measure as opposed to either $\log(S)$ or $\log(p_{1024})$; this is true regardless
of whether the sample is peak flux-limited, fluence-limited, or duration-limited. This result
is demonstrated in Table 3, where hardness vs. intensity correlations are examined using a
Spearman Rank-Order Correlation test for the three different intensity measures: the 1024
ms peak flux $p_{1024}$, the fluence $S$, and the dual timescale peak flux $\mathcal{F}$. Small probabilities
indicate strong correlations between spectral hardness and the peak flux measure. Spectral
hardness (and presumably redshift) correlates better with the dual timescale peak flux than
with any other peak flux measure, regardless of which measure is used to select the sample.
Furthermore, the log(HR321) vs. log(peak flux) slope is essentially identical for burst samples
of different T90 durations when $\log(\mathcal{F})$ is used; it does not appear that the same can be said
when either $\log(p_{1024})$ or $\log(S)$ is used as a peak flux measure. Thus, $\mathcal{F}$ appears to
deconvolve the attributes of hardness, duration, and peak flux more easily than does either $S$ or $p_{1024}$.
We take this to indicate that $\mathcal{F}$ is a preferred peak flux indicator to $S$ and $p_{1024}$.
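For readers who wish to repeat this kind of comparison on their own samples, the rank-order coefficient underlying the test can be computed as in the following sketch. It is a generic textbook implementation (Pearson correlation of average ranks), not the code used for Table 3, and it returns only the coefficient; converting it to a probability of no correlation requires the null distribution, which is omitted here. The data values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied entries their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank-order coefficient: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0.0 or vy == 0.0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


# Hypothetical log(intensity) and log(HR321) values for five bursts.
log_intensity = [-1.2, -0.8, -0.5, 0.1, 0.6]
log_hardness = [0.02, 0.10, 0.08, 0.25, 0.31]
print(spearman_rho(log_intensity, log_hardness))
```

Applying the same function with $\log(p_{1024})$, $\log(S)$, and $\log(\mathcal{F})$ in turn as the intensity measure reproduces the structure of the comparison, though not its significance levels.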

There are potentially far-reaching consequences to having $\mathcal{F}$ as a less-biased temporal
flux measure. To date, essentially all statistical studies have used either $\log(p_{1024})$ or $S$ as
intensity measures (e.g. $\log(N > S)$ vs. $\log(S)$, $\log(N > p_{1024})$ vs. $\log(p_{1024})$, $E_{\rm peak}$ vs.
$\log(p_{1024})$). These studies are potentially biased because $S$ and $p_{1024}$ do not deconvolve the
hardness intensity correlation as cleanly as does $\mathcal{F}$. Presumably, studies made using $S$ and
$p_{1024}$ combine longer bursts measured at one value of HR321 with short bursts measured
at another value of HR321. The use of $\mathcal{F}$ in future modeling endeavors might improve our
understanding of gamma-ray burst properties more than the use of either fluence or peak flux.

We test our hypothesis that Class I bursts and Class III bursts belong to the same
population by submitting all bursts in the original sample brighter than $\mathcal{F}_0$ to the EM
algorithm for unsupervised classification, using the attributes of $S$, T90, and HR321.

The classifier preferably recovers six classes as opposed to three; the original three-class
structure is lost as a result of the new trigger. Despite this, Class II is still easily identifiable
even though it contains only 40 members (Class II bursts in the BATSE Catalogs thus appear
to have been preferentially detected as a result of BATSE's short timescale trigger). The
remaining bursts are placed in five classes with properties not recognizable as belonging to
the original Class I or Class III. These classes may provide interesting additional insights
into burst properties, but they warrant no further discussion here because they are not
identifiable as the original burst classes.

Thus, strong reasons exist to conclude that the Class III cluster arises primarily from the shape of
the attribute space defined by BATSE's peak flux trigger, and not from a separate source
population. We have demonstrated that both fluence and duration are truncated by
BATSE's peak flux trigger. The truncation effectively oversamples short bursts relative to
long bursts. As a result of this truncation, the database contains an excess of faint, short
(soft) bursts. The use of the dual timescale trigger supports the hypothesis that Classes I
and III are really one continuous duration distribution with faint bursts being softer than
bright bursts. The properties of this continuous distribution become somewhat ambiguous
at low signal-to-noise, where the fluence duration bias alters burst properties.

On the other hand, Class II appears to represent a separate source population from
Class I (Hakkila et al. 2000a). Neither sampling biases nor instrumental biases appear to be
responsible for creating Class II characteristics from Class I bursts. However, it should be
noted that BATSE's short trigger timescales have aided in the large detection rate of these
short events.



4. Conclusions 

We have demonstrated that

1. Gamma-ray burst Class III does not have to represent a separate source population;
it can be produced by the integration time of the instrumental trigger.

2. The fluence duration bias by itself, as modeled from a sample of high signal-to-noise
bursts, is unlikely to be responsible for the existence of Class III.

3. Class III is likely produced by an excess of short, low fluence bursts detected by
BATSE's short trigger temporal window.

4. The excess bursts can be eliminated via a selection process that is dual timescale peak
flux-limited, rather than peak flux-limited or fluence-limited.

5. The dual timescale peak flux measure resulting from this selection process appears
to correlate better with hardness (and therefore with $E_{\rm peak}$ and redshift) than either
peak flux or fluence. This adds support to the argument that dual timescale peak
fluxes correct the temporal limitations introduced by using single timescale peak fluxes.
Dual timescale peak fluxes can be established for many combinations of temporal
measurements.

The results found here are important to gamma-ray burst astrophysics as well as to
the general problem of scientific classification. Data mining tools can help identify complex
clusters in multi-dimensional attribute spaces. The tools are sensitive to clusters and data
patterns, as evidenced here by the fact that they have allowed us to discover clusters produced
artificially as a result of sample incompleteness. This sensitivity is advantageous, because a
better understanding of instrumental response and sampling biases can be used to improve
the design of future instruments.

We note that sample incompleteness is generic and applies to the detection of any
transient sources identified as the result of a temporal trigger. Examples of transient event
statistics that might be biased by a temporal trigger include flare stars, soft gamma repeaters,
x-ray bursts, and earthquakes.

However, the sensitivity of data mining tools can also cause problems. Data mining is
central to the operation of planned Virtual Observatories, which will electronically combine
data collected from a variety of instruments with a range of temporal, spectral, and intensity
responses. Since sample incompleteness can cause a single instrument with one set of
characteristics to find phantom classes, classes identified using multiple instruments should
be interpreted cautiously. The instrumental responses of Virtual Observatory components
will have to be accurately known in order for newly-identified classes to be recognized as
separate source populations.

It is important to recognize that data mining techniques have their limitations. Principal
component analysis has identified fluence, duration, and hardness as being critical gamma-ray
burst classification attributes, while the trigger attribute of peak flux was not chosen.
Data mining classifiers failed to recognize that attribute selection had removed the attribute
that could have provided the most insight into the gamma-ray burst clustering structure.

We gratefully acknowledge NASA support under grant NRA-98-OSS-03 (the Applied
Information Systems Research Program) and NSF support under grant AST-0098499 (Research
in Undergraduate Institutions). We also thank James Neff and Robert Dukes for

Fig. 1. — Hardness and duration properties found by forcing the unsupervised classifier ESX
to find two classes using 798 bursts in a sample defined by homogeneous trigger criteria.
The two-class structure has forced many bursts traditionally placed in the Long class to be
reclassified as Short.

Fig. 2. — Fluence and duration properties found by forcing the unsupervised classifier ESX
to find two classes. A sharper division exists between classes in the fluence vs. duration
parameter space than in the hardness vs. duration parameter space. Since fluence is an
extrinsic attribute, we conclude that the fluence attribute is biased.

Fig. 3. — Fluence vs. peak flux diagram showing the regions occupied by Class I (Long),
Class II (Short), and Class III (Intermediate) bursts (as determined from the unsupervised
EM algorithm). Maximum effects of the fluence duration bias are overlaid for a hypothetical
Class I burst. The proportional fluence and peak flux decrease is shown (diagonal line) as
is the maximum fluence decrease due to the bias (curving line). The maximum amount by
which the fluence would need to be corrected, $\Delta\log(S)_{\rm corr}$, is also shown (vertical line).

Fig. 4. — Demonstration of the strong correlation between flux duration $\Gamma$ and duration T90
for the combined sample of Class III (diamonds) and Class I (squares).

Fig. 5. — Effects of fluence and peak flux triggers on selection of gamma-ray bursts. The
BATSE 1024 ms trigger preferentially detects short bursts near threshold (region C), while
missing longer bursts (region A). A hypothetical fluence trigger would preferentially detect
long bursts (region D), while missing shorter bursts (region B). A proposed dual timescale
threshold (which could be developed as an instrumental trigger on other experiments) would
not oversample long or short bursts.

Fig. 6. — Spectral hardness vs. dual timescale peak flux $\mathcal{F}/\mathcal{F}_0$ (normalized to the dual
timescale threshold) for a binned sample of Class I + III bursts. The longest bursts (diamonds)
have $\langle\Gamma\rangle = 20$ seconds, bursts of moderate duration (squares) have $\langle\Gamma\rangle = 6.25$
seconds, and shorter bursts (asterisks) have $\langle\Gamma\rangle = 2$ seconds. Faint bursts (as measured by
$\mathcal{F}$) are softer than bright bursts regardless of duration. The faintest bursts (short due to
BATSE's short temporal trigger) are the softest of
all. It appears that a faint sample of longer bursts (as measured by $\mathcal{F}$) should be as soft as
the corresponding shorter bursts; these bursts require either a fainter peak flux trigger or a
fluence trigger to be detected.

Table 1: Mean class properties when unsupervised classifiers are forced to recover three
classes. Although each classifier produces different results, we refer to the similar recovered
classes as Class I (Long), Class II (Short), and Class III (Intermediate).

Class      Property       ESX     Kohonen   EM      Kmeans
Class I    No. bursts     250     225       422     273
           log(fluence)   -5.19   -5.06     -5.39   -5.16
           log(T90)       1.71    1.71      1.70    1.78
           log(HR321)     0.31    0.31      0.21    0.26
Class II   No. bursts     194     239       144     173
           log(fluence)   -6.71   -6.63     -6.76   -6.72
           log(T90)       0.02    0.13      -0.23   -0.12
           log(HR321)     0.44    0.39      0.58    0.53
Class III  No. bursts     354     334       232     352
           log(fluence)   -5.95   -5.94     -6.28   -6.07
           log(T90)       1.38    1.51      1.02    1.28
           log(HR321)     0.06    0.07      0.04    0.06

Table 2: Comparison of bright (large $p_{1024}$) burst distribution to the faint (small $p_{1024}$) burst
distribution (which is presumed to be biased by the fluence duration bias). The fluence
of each burst is "corrected" for the assumed bias by an amount $p_i D \log(S)_{\rm max}$, where $p_i$
represents a random probability that a burst has had its fluence underestimated by the bias
and $D$ is an overall amplitude of the bias ($D = 0$ indicates no bias and $D = 1$ indicates a large
bias). Each Monte Carlo model has been run 100 times and averaged, producing an average
$\chi^2$, $\langle\chi^2\rangle$, and a corresponding probability of exceeding $\langle\chi^2\rangle$, $q$. Although the fluences of faint
bursts appear to have been underestimated in a manner consistent with the proposed bias
(based on the $D = 0$ model), the amplitude of the bias is inconsistent with that originally
proposed (Hakkila et al. 2000a). The best fit amplitude ($q \approx 0.1$) is too small to account
for the large number of faint bursts that have been placed in Class III. It also appears that
bursts with large T90 values are more likely than those with small T90 to have had their
fluences underestimated, supporting the hypothesis that the fluence duration bias does not
entirely explain the existence of Class III.

D        <chi^2>   dof   q (> chi^2)
1        34        7     2 × 10^-5
0.667    23        7     2 × 10^-3
0.5      19        7     10^-2
0.25     13        7     7 × 10^-2
0.1      12        7     0.11
0        14        7     0.055

Table 3: Spearman Rank-Order Correlation probability that no correlation exists between
hardness ratio HR321 and three different peak flux measures: the 1024 ms peak flux $p_{1024}$,
the fluence $S$, and the dual timescale peak flux $\mathcal{F}$. Small probabilities indicate strong
correlations between spectral hardness and the peak flux measure. The results indicate that
spectral hardness (and $E_{\rm peak}$, therefore presumably redshift) correlates better with the dual
timescale peak flux than with any other peak flux measure, regardless of which measure is
used to select the sample. Larger probabilities are found for the $S$-limited and $\mathcal{F}$-limited
samples than for the $p_{1024}$-limited sample because these have been produced by truncating
data originally collected using the BATSE $p_{1024}$-limited sample. Note that $S$ produces a
smaller probability with HR321 than $p_{1024}$ for a $p_{1024}$-limited sample; this is because the
softest bursts have the smallest $S$ due to the truncated shape of the sampled parameter
space (e.g. region C in Figure 5). Similarly, $p_{1024}$ produces a smaller correlation probability
than $S$ for an $S$-limited sample.

                         Prob. of no correlation between HR321 and:
                         p1024            S                F
p1024-limited sample     1.63 × 10^-…     1.07 × 10^-19    9.94 × 10^-20
S-limited sample         2.58 × 10^-…     1.30 × 10^-10    1.54 × 10^-12
F-limited sample         3.14 × 10^-…     1.47 × 10^-17    3.33 × 10^-…